Code
COFFEE_COST = 6.0
WORKING_DAYS = 250
annual_cost = COFFEE_COST * WORKING_DAYSA masochistic exploration of personal finance
Nicholas Dorsch
June 12, 2025
I moved to Melbourne last year, which, along with being a questionable career choice, turned an already troubling dependence on caffeine into a behavioural disorder.
I bought a Nespresso machine—telling myself I would save money on coffee—and it has been a huge success. Now, with the help of my Nespresso machine I have more than doubled my coffee consumption. The coffee pod coffee doesn’t count, you see. Yes I’ll come down for a coffee, I haven’t had mine yet.
This leads me to wonder how much of my family’s future I am eroding to bean-dust, every time I head downstairs for another medium latte, thanks—the absent droning sound I make every morning between the hours of 9 and 10 am. Each milky-upper is costing me $6. So assuming I work 250 days in a year… let’s see now:
It’s $1500.0. You can check the code if you want to make sure I got that.
I’m not devastated by that number—it’s not a small amount, but neither is it enough to send my first born to school—and perhaps a less masochistic person would leave the matter there… but it does beg the question of what I might be doing with that $1500 a year if I wasn’t spending it on morning browns for morning frowns.
To estimate at how much this behaviour is costing me, I’ll put together a model that compares two strategies:
After 20 years, we’ll see how Nick the coffee dribbler is doing, compared to Nick the slightly more financially responsible investor.
Before things get more complicated, let’s have a look at how these strategies perform assuming an 8% annual return on the market:
def create_date_array(start_date: datetime, num_months: int):
start = np.datetime64(start_date.replace(day=1), 'M')
month_array = start + np.arange(num_months)
return month_array.astype("datetime64[D]")
def annual_to_monthly_rate(annual_rate: float) -> float:
return (1 + annual_rate) ** (1 / 12) - 1
def calculate_compound_returns(
investments: np.ndarray,
annual_rate: float
) -> np.ndarray:
num_months = len(investments)
monthly_return = annual_to_monthly_rate(annual_rate)
returns = np.zeros(num_months)
for i in range(num_months):
months_invested = np.arange(num_months - i)
returns[i:] += investments[i] * (1 + monthly_return) ** months_invested
return returns
YEARS = 20
ANNUAL_RATE = 0.08
num_months = YEARS * 12
months = np.arange(num_months)
dates = create_date_array(datetime.today(), num_months=num_months)
investing = np.full_like(dates, COFFEE_COST * 30).astype(float)
spending = -1 * investing
returns = calculate_compound_returns(
investments=investing,
annual_rate=ANNUAL_RATE
)
df = pd.DataFrame({
"Date": dates,
"Coffee": np.cumsum(spending),
"Investing": np.cumsum(investing),
"Returns": returns - np.cumsum(investing),
}).round(2)
flat_df = df.melt(
id_vars=["Date"],
var_name="Case",
value_name="$"
)def plot_balance_area_chart(df: pd.DataFrame, title: str):
return (
alt.Chart(df)
.mark_area(opacity=0.75)
.encode(
alt.X("Date:T", title=""),
alt.Y("$:Q"),
alt.Color(
"Case:N",
legend=alt.Legend(orient="top-left")
),
tooltip=[
alt.Tooltip("Date:T", title=""),
alt.Tooltip("$:Q", title="$", format="$,.2f"),
alt.Tooltip("Case:N", title="")
],
order=alt.Order("Case:N")
)
.properties(
title=title,
height=PLOT_HEIGHT,
width=PLOT_WIDTH
)
)
chart = plot_balance_area_chart(
flat_df, title="Comparison of Investing vs Coffee Drinking"
)
chart.show()Unless I’ve screwed up the maths here, things are already looking very grim. Nick the dribbler is down $43200 in 20 years, whereas Nick the investor is giving his kid a very nice $102419 graduation bonus.
Ok, but what about inflation, and the time value of money, and all that finance stuff? asks Nick the dribbler, pretending he isn’t an idiot.
Well the time value of money is a bit of a flimsy concept when the present money is being spent on hot milk that will cool down in minutes and expire in a matter of days, but let’s see.
Assuming 2.5% inflation annually, and a discount rate of 13.5%, which look like this:
def create_exponential_curve(
months: np.ndarray,
annual_rate: float
) -> np.ndarray:
monthly_rate = annual_to_monthly_rate(annual_rate)
return np.exp(months * monthly_rate)
DISCOUNT_RATE = -0.135
INFLATION_RATE = 0.025
discount_curve = create_exponential_curve(months, DISCOUNT_RATE)
inflation_curve = create_exponential_curve(months, INFLATION_RATE)
def plot_factor_curves(discount_curve, inflation_curve) -> alt.Chart:
return (
alt.Chart(
pd.DataFrame({
"Date": dates,
"Inflation": inflation_curve,
"Discount": discount_curve,
"Net Adjustment": inflation_curve * discount_curve
})
.melt(
id_vars=["Date"],
var_name="Curve",
value_name="Factor"
)
)
.mark_line()
.encode(
alt.X("Date:T"),
alt.Y("Factor:Q"),
alt.Color(
"Curve:N",
legend=alt.Legend(orient="top-left"),
),
tooltip=[
alt.Tooltip("Date:T", title=""),
alt.Tooltip("Factor:Q", format=".3f"),
alt.Tooltip("Curve:N", title="")
],
)
.properties(
height=PLOT_HEIGHT,
width=PLOT_WIDTH
)
)
chart = plot_factor_curves(discount_curve, inflation_curve)
chart.show()… Investor Nick’s returns look like this:
discount_df = pd.DataFrame({
"Date": dates,
"Coffee": np.cumsum(spending) * inflation_curve * discount_curve,
"Investing": np.cumsum(investing) * inflation_curve * discount_curve,
"Returns": (returns - np.cumsum(investing)) * inflation_curve * discount_curve,
}).round(2)
flat_discount_df = discount_df.melt(
id_vars=["Date"],
var_name="Case",
value_name="$"
)
chart = plot_balance_area_chart(
flat_discount_df,
title="(Real, Discounted at 13.5%): Comparison of Investing vs Coffee Drinking"
)
chart.show()…So what does that mean?
If present me values future dollars at a discount rate of 13.5%, I am basically claiming that money on the table today can be converted into a 13.5% return in a year, through my… savvy investing strategy.
That means that future dollars are worth less to me (not worthless, worth less). The sooner I have those dollars, the sooner I can put them to work to make me that return.
And compounding this out to 20 years from now implies that a dollar then is worth about 5% of its value to me today.
This applies to dollars spent as well as dollars invested—future expenses are smaller when discounted too.
So what discount rate could justify Nick the dribbler’s behaviour? Well, he isn’t investing that money, which means he must really, really value that 3 minute mouth experience, which would imply a very aggressive discount rate, probably in excess of 100%.
discount_df = pd.DataFrame({
"Date": dates,
"Coffee": np.cumsum(spending) * inflation_curve * discount_curve,
"Investing": np.cumsum(investing) * inflation_curve * discount_curve,
"Returns": (returns - np.cumsum(investing)) * inflation_curve * discount_curve,
}).round(2)
flat_discount_df = discount_df.melt(
id_vars=["Date"],
var_name="Case",
value_name="$"
)
chart = plot_balance_area_chart(
flat_discount_df,
title="(Real, Discounted at 100%): Comparison of Investing vs Coffee Drinking"
)
chart.show()So, any money beyond a year time horizon is worth jack shit to Nick the dribbler. He lives by the froth, dies by the froth, neglects his financial future by the froth.
Clearly this isn’t rational, and realistically speaking I would expect an individual to have a discount rate of a little over some risk-free return—like treasury bonds at around 4.5%—up to around 9%, depending on their risk tolerance. Here’s 6%:
discount_curve = create_exponential_curve(months, annual_rate=-0.06)
discount_df = pd.DataFrame({
"Date": dates,
"Coffee": np.cumsum(spending) * inflation_curve * discount_curve,
"Investing": np.cumsum(investing) * inflation_curve * discount_curve,
"Returns": (returns - np.cumsum(investing)) * inflation_curve * discount_curve,
}).round(2)
flat_discount_df = discount_df.melt(
id_vars=["Date"],
var_name="Case",
value_name="$"
)
chart = plot_balance_area_chart(
flat_discount_df,
title="(Real, Discounted at 6%): Comparison of Investing vs Coffee Drinking"
)
chart.show()But you can’t just assume a 8.0% year on year return like that! Market performance isn’t guaranteed! protests Nick the dribbler, milk froth and steam spilling from his maw as if Smaug the dragon had recently left the Lonely Mountain, put on some weight and taken up residence in South Melbourne.
And fair enough! From a risk perspective it’s worth examining what range of outcomes someone is exposed to when thinking about saving for the future. It’s also not as daunting a task as it might first appear, especially if we settle for a relatively simple to implement price model.
First, we need data.
Here is the last 15 years of ASX200 market data:
import yfinance as yf
price_df = (
yf.Ticker("^AXJO")
.history(period="15y")
.reset_index()
)
chart = (
alt.Chart(price_df)
.mark_line(color="red", strokeWidth=1)
.encode(
alt.X("Date:T"),
alt.Y("Close:Q")
)
.properties(
title='ASX 200 History',
height=PLOT_HEIGHT,
width=PLOT_WIDTH,
)
)
chart.show()For our forecast, we’ll need to transform this data into something a bit more convenient, the log returns, defined as:
\[ \begin{align} \ln R_t = \ln \frac{P_t}{P_{t-1}} \end{align} \]
price_df["Returns"] = (
np.log(
price_df["Close"] / price_df["Close"].shift(1)
)
)
returns_chart = (
alt.Chart(price_df)
.mark_line(color="limegreen", strokeWidth=1)
.encode(
alt.X("Date:T"),
alt.Y("Returns:Q")
)
.properties(
title='ASX 200 Log Returns',
height=PLOT_HEIGHT,
width=PLOT_WIDTH,
)
)
returns_chart.show()Because we are more or less taking the derivative of price—the degree to which prices change day to day—we end up with a nice, relatively stationary signal of the market. Removing the time axis gives us a dataset that I can fit a distribution to:
params = stats.norm.fit(price_df["Returns"].dropna())
dist = stats.norm(*params)
x_range = np.linspace(
price_df["Returns"].min(),
price_df["Returns"].max(),
250
)
chart = stat_plots.hist_dist_plot(
price_df,
col="Returns",
scipy_dist=dist,
title="Normal Distribution Fitted to Returns",
hist_color="limegreen"
)
chart.show()But the normal distribution used above is doing a pretty poor job of capturing the data. A Student’s T distribution will better capture the tails:
params = stats.t.fit(price_df["Returns"].dropna())
dist = stats.t(*params)
x_range = np.linspace(
price_df["Returns"].min(),
price_df["Returns"].max(),
250
)
chart = stat_plots.hist_dist_plot(
price_df,
col="Returns",
scipy_dist=dist,
title="Student's T Distribution Fitted to Returns",
hist_color="limegreen"
)
chart.show()At this point it looks like we have a good-enough fit, but it is worth plotting samples back on a time-axis so we can directly compare what our model produces to actual historical returns:
sample_df = pd.DataFrame({
"Date": price_df["Date"],
"Simulated Returns": dist.rvs(size=len(price_df["Date"]))
})
sample_chart = (
alt.Chart(sample_df)
.mark_line(color="orange", strokeWidth=1)
.encode(
alt.X("Date:T"),
alt.Y("Simulated Returns:Q")
)
)
combined_chart = (sample_chart + returns_chart).resolve_axis(x="shared", y="shared")
combined_chart.show()… and it does okay. The elephant in the room is COVID-19. Independent samples from a distribution can’t capture clustered periods of volatility like that. The samples are also a bit fuzzier in general—there is more volatility in the simulated market than was seen in reality. Overall I give the model a:

Now that we have a distribution, we can simulate forecasts from it.
First, since the distribution is fitted to daily returns and our forecast will be monthly, an adjustment is needed:
def convert_t_dist_to_monthly(t_dist, days_per_month=21):
df, mu, sigma = t_dist.args[0], t_dist.args[1], t_dist.args[2]
# Monthly transformation
mu_monthly = days_per_month * mu
sigma_monthly = sigma * np.sqrt(days_per_month)
return stats.t(df=df, loc=mu_monthly, scale=sigma_monthly)
monthly_dist = convert_t_dist_to_monthly(dist)Right, now the fun stuff.
It turns out that you can cook up a particular stochastic forecast called a Geometric Brownian Motion (GBM) model quite easily with a numpy magic spell:
sims = 1000
initial_balance = COFFEE_COST
def simulate_monthly_gbm(
monthly_returns_dist,
initial_balance: float,
num_months: int,
sims: int = 100
) -> np.ndarray:
return initial_balance * (
np.exp(
np.cumsum(
monthly_returns_dist.rvs(
size=(num_months, sims)
),
axis=0
)
)
)
balance = simulate_monthly_gbm(
monthly_dist,
initial_balance=COFFEE_COST,
num_months=num_months,
sims=sims
)In words this is the exponentiated cumulative sum of log returns, or excumsumlogret.

When you hold out your coffee cup and say the magic word, many universes spout forth, in which the $6.0 purchase of that coffee is instead invested into the market.
Looking at the end point of all those universes 20 years from now, we get a distribution of that coffee’s final value:
df = pd.DataFrame({
"Sim": np.arange(sims),
"$": balance[-1, :]
})
chart = (
alt.Chart(df)
.mark_bar()
.encode(
alt.X("$:Q", bin=alt.Bin(maxbins=250)),
alt.Y(
"count()",
title="",
axis=alt.Axis(labels=False, ticks=False, title=None),
)
)
.properties(
title='Final Coffee Value',
height=PLOT_HEIGHT,
width=PLOT_WIDTH,
)
)
chart.show()Now that we have a way of simulating forecasts, I can run the investing strategy (putting the price of a cup of coffee into the market each day) through the model to see how the portfolio might perform over the 20 year period.
def simulate_compounded_returns(
investments: np.ndarray,
monthly_returns_dist,
num_months: int,
) -> np.ndarray:
returns = np.exp(
monthly_returns_dist.rvs(size=(num_months, sims))
)
portfolio_values = np.zeros_like(returns)
for i in range(num_months):
# Initial investment
if i == 0:
portfolio_values[i] = investments[i] * returns[i]
# New investment + previously invested
else:
portfolio_values[i] = (
(portfolio_values[i-1] + investments[i]) * returns[i]
)
return portfolio_values
portfolio = simulate_compounded_returns(
investments=investing,
monthly_returns_dist=monthly_dist,
num_months=num_months
)df = pd.DataFrame({
"Sim": np.arange(sims),
"$": portfolio[-1, :]
})
chart = (
alt.Chart(df)
.mark_bar()
.encode(
alt.X("$:Q", bin=alt.Bin(maxbins=250)),
alt.Y(
"count()",
title="",
axis=alt.Axis(labels=False, ticks=False, title=None),
)
)
.properties(
title='Final Portfolio Value',
height=PLOT_HEIGHT,
width=PLOT_WIDTH,
)
)
chart.show()And if we discount at the conservative rate of 6%:
df = pd.DataFrame({
"Sim": np.arange(sims),
"$": discounted_portfolio[-1, :]
})
chart = (
alt.Chart(df)
.mark_bar()
.encode(
alt.X("$:Q", bin=alt.Bin(maxbins=250)),
alt.Y(
"count()",
title="",
axis=alt.Axis(labels=False, ticks=False, title=None),
)
)
.properties(
title='(Discounted) Final Portfolio Value',
height=PLOT_HEIGHT,
width=PLOT_WIDTH,
)
)
chart.show()| P99 | P90 | P50 | P10 | P1 | |
|---|---|---|---|---|---|
| Nominal Returns | 92008.0 | 139778.0 | 266728.0 | 547752.0 | 1130568.0 |
| Discounted Returns | 26915.0 | 40890.0 | 78027.0 | 160235.0 | 330728.0 |
So we don’t know for sure how the market will perform over the next 20 years, but it looks like a good bet to drink less coffee and invest a bit more.
That feels like a very underwhelming conclusion, after all that effort.
Even though all models are wrong and every prediction is contingent on a set of assumptions and all that… I think the power of doing this is to stoke the imagination about what might be possible beyond the typical point estimate of 20.54% return over 10 years, especially when it comes to thinking about exposure—to risks as well as opportunities.
Having a single prediction doesn’t help you to think about the range of things that might happen, but a stochastic model like the simple forecast above does. Now you can think about how likely it is you’ll lose all your money, or how likely it is that curbing your coffee habit will make you a millionaire, or anything in between.
And what about the consequences of that range of outcomes? The lower tail predictions of the model might have extremely dire consequences that lead you not to act, for fear of the repercussions. The upper tail might be so attractive that it becomes obvious that the exposure is worth the entry price, even if the “expectation” doesn’t look that great. This kind of reasoning isn’t possible without uncertainty quantification.
The robustness of the model is always going to be a concern, but it seems to me a lot less flimsy than a single prediction. I’m not sure quantitative predictions about the future make much sense at all unless they quantify uncertainty somehow. Why draw a particular scenario out of a hat when there are an infinite set of others to choose from?
This also makes stochastic models more dangerous in a sense—having a distribution as output might lead you or your colleagues to believe you’ve actually captured the possibilities objectively. You haven’t. What you have is the formalized consequences of your chosen assumptions—in this case a basic representation of past behaviour of the market propagated into the future—and that can still be useful, but it’s not the same as knowing what will actually happen. Point estimates don’t carry that pretense, they’re dumb but they’re honest.